68 research outputs found

    Peak Alignment of Gas Chromatography-Mass Spectrometry Data with Deep Learning

    We present ChromAlignNet, a deep learning model for alignment of peaks in Gas Chromatography-Mass Spectrometry (GC-MS) data. In GC-MS data, a compound's retention time (RT) may not stay fixed across multiple chromatograms. Using GC-MS data for biomarker discovery requires aligning the RTs of identical analytes across samples. Current alignment methods are all based on a set of formal, mathematical rules. We present a solution to GC-MS alignment using deep learning neural networks, which are more adept at complex, fuzzy data sets. We tested our model on several GC-MS data sets of various complexities and analysed the alignment results quantitatively. We show the model has very good performance (AUC ∼ 1 for simple data sets and AUC ∼ 0.85 for very complex data sets). Further, our model easily outperforms existing algorithms on complex data sets. Compared with existing methods, ChromAlignNet is very easy to use as it requires no user input of reference chromatograms and parameters. This method can easily be adapted to other similar data such as those from liquid chromatography. The source code is written in Python and available online.

    Informative and misinformative interactions in a school of fish

    It is generally accepted that, when moving in groups, animals process information to coordinate their motion. Recent studies have begun to apply rigorous methods based on Information Theory to quantify such distributed computation. Following this perspective, we use transfer entropy to quantify dynamic information flows locally in space and time across a school of fish during directional changes around a circular tank, i.e. U-turns. This analysis reveals peaks in information flows during collective U-turns and identifies two different flows: an informative flow (positive transfer entropy) from fish that have already turned to fish that are turning, and a misinformative flow (negative transfer entropy) from fish that have not yet turned to fish that are turning. We also reveal that the information flows are related to relative position and alignment between fish, and identify spatial patterns of information and misinformation cascades. This study offers several methodological contributions, and we expect further application of these methodologies to reveal intricacies of self-organisation in other animal groups and active matter in general.
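
The distinction between positive and negative local transfer entropy can be illustrated with a minimal sketch. The abstract does not say which estimator the authors used, so everything here is an assumption made for illustration: discrete-valued series, a history length of one step, plug-in (frequency-count) probabilities, and the hypothetical function name `local_transfer_entropy`.

```python
from collections import Counter
from math import log2

def local_transfer_entropy(source, target):
    """Local transfer entropy (bits) from source to target at each time step.

    Illustrative assumptions: discrete states, history length 1, and
    probabilities estimated by simple frequency counts over the series.
    """
    # Each triple is (x_{t+1}, x_t, y_t): next target, past target, past source.
    triples = list(zip(target[1:], target[:-1], source[:-1]))
    n_full = Counter(triples)
    n_past = Counter((x_prev, y_prev) for _, x_prev, y_prev in triples)
    n_target = Counter((x_next, x_prev) for x_next, x_prev, _ in triples)
    n_prev = Counter(x_prev for _, x_prev, _ in triples)

    local = []
    for x_next, x_prev, y_prev in triples:
        # p(x_{t+1} | x_t, y_t): prediction that also uses the source's past.
        p_with_source = n_full[(x_next, x_prev, y_prev)] / n_past[(x_prev, y_prev)]
        # p(x_{t+1} | x_t): prediction from the target's own past only.
        p_without_source = n_target[(x_next, x_prev)] / n_prev[x_prev]
        local.append(log2(p_with_source / p_without_source))
    return local

# Toy data (hypothetical): the target copies the source with a one-step lag.
source = [0, 1, 1, 0, 1, 0, 0, 1, 1, 0]
target = [0] + source[:-1]
values = local_transfer_entropy(source, target)
average_te = sum(values) / len(values)
```

A local value is positive when knowing the source's past makes the observed next state of the target more likely, and negative when it makes that state less likely; the latter is the sense in which a flow can be called misinformative. Averaging the local values gives the usual (non-negative) transfer entropy under this estimator.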

    Feature selection for chemical sensor arrays using mutual information

    We address the problem of feature selection for classifying a diverse set of chemicals using an array of metal oxide sensors. Our aim is to evaluate a filter approach to feature selection with reference to previous work, which used a wrapper approach on the same data set, and established best features and upper bounds on classification performance. We selected feature sets that exhibit the maximal mutual information with the identity of the chemicals. The selected features closely match those found to perform well in the previous study using a wrapper approach to conduct an exhaustive search of all permitted feature combinations. By comparing the classification performance of support vector machines (using features selected by mutual information) with the performance observed in the previous study, we found that while our approach does not always give the maximum possible classification performance, it always selects features that achieve classification performance approaching the optimum obtained by exhaustive search. We performed further classification using the selected feature set with some common classifiers and found that, for the selected features, Bayesian Networks gave the best performance. Finally, we compared the observed classification performances with the performance of classifiers using randomly selected features. We found that the selected features consistently outperformed randomly selected features for all tested classifiers. The mutual information filter approach is therefore a computationally efficient method for selecting near-optimal features for chemical sensor arrays.
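
A mutual information filter of the kind described above can be sketched in a few lines. This is a simplified illustration rather than the authors' procedure: it scores each feature individually (the abstract refers to scoring feature sets), it assumes features have already been discretised, and the sensor names, data, and function names are hypothetical.

```python
from collections import Counter
from math import log2

def mutual_information(xs, ys):
    """Mutual information I(X;Y) in bits between two discrete sequences."""
    n = len(xs)
    count_x = Counter(xs)
    count_y = Counter(ys)
    count_xy = Counter(zip(xs, ys))
    return sum(
        (c / n) * log2((c / n) / ((count_x[x] / n) * (count_y[y] / n)))
        for (x, y), c in count_xy.items()
    )

def rank_features(features, labels):
    """Return (feature name, score) pairs sorted by decreasing mutual information."""
    scores = {name: mutual_information(vals, labels) for name, vals in features.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Toy data (hypothetical): two discretised sensor features, two chemical classes.
labels = ["ethanol", "ethanol", "acetone", "acetone"]
features = {
    "sensor_1": [0, 0, 1, 1],  # determines the class exactly
    "sensor_2": [0, 1, 0, 1],  # statistically independent of the class
}
ranking = rank_features(features, labels)
```

Because the score is computed directly from the data without training a classifier, a filter like this is cheap compared with a wrapper approach, which must retrain and evaluate a classifier for every candidate feature combination.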

    Oxygen-sensing neurons reciprocally regulate peripheral lipid metabolism via neuropeptide signaling in Caenorhabditis elegans

    The mechanisms by which the sensory environment influences metabolic homeostasis remain poorly understood. In this report, we show that oxygen, a potent environmental signal, is an important regulator of whole body lipid metabolism. C. elegans oxygen-sensing neurons reciprocally regulate peripheral lipid metabolism under normoxia in the following way: under high oxygen and food absence, URX sensory neurons are activated, and stimulate fat loss in the intestine, the major metabolic organ for C. elegans. Under lower oxygen conditions or when food is present, the BAG sensory neurons respond by repressing the resting properties of the URX neurons. A genetic screen to identify modulators of this effect led to the identification of a BAG-neuron-specific neuropeptide called FLP-17, whose cognate receptor EGL-6 functions in URX neurons. Thus, BAG sensory neurons counterbalance the metabolic effect of tonically active URX neurons via neuropeptide communication. The combined regulatory actions of these neurons serve to precisely tune the rate and extent of fat loss to the availability of food and oxygen, and provide an interesting example of the myriad mechanisms underlying homeostatic control.

    Pheromone-sensing neurons regulate peripheral lipid metabolism in Caenorhabditis elegans

    It is now established that the central nervous system plays an important role in regulating whole body metabolism and energy balance. However, the extent to which sensory systems relay environmental information to modulate metabolic events in peripheral tissues has remained poorly understood. In addition, it has been challenging to map the molecular mechanisms underlying discrete sensory modalities with respect to their role in lipid metabolism. In previous work, our lab identified instructive roles for serotonin signaling as a surrogate for food availability, as well as oxygen sensing, in the control of whole body metabolism. In this study, we now identify a role for a pair of pheromone-sensing neurons in regulating fat metabolism in C. elegans, which has emerged as a tractable and highly informative model to study the neurobiology of metabolism. A genetic screen revealed that GPA-3, a member of the Gα family of G proteins, regulates body fat content in the intestine, the major metabolic organ for C. elegans. Genetic and reconstitution studies revealed that the potent body fat phenotype of gpa-3 null mutants is controlled from a pair of neurons called ADL(L/R). We show that cAMP functions as the second messenger in the ADL neurons, and regulates body fat stores via the neurotransmitter acetylcholine, from downstream neurons. We find that the pheromone ascr#3, which is detected by the ADL neurons, regulates body fat stores in a GPA-3-dependent manner. We define here a third sensory modality, pheromone sensing, as a major regulator of body fat metabolism. The pheromone ascr#3 is an indicator of population density; thus, we hypothesize that pheromone sensing provides a salient 'denominator' to evaluate the amount of food available within a population and to accordingly adjust metabolic rate and body fat levels.

    Fine-Scale Mapping of the 4q24 Locus Identifies Two Independent Loci Associated with Breast Cancer Risk

    Background: A recent association study identified a common variant (rs9790517) at 4q24 to be associated with breast cancer risk. Independent association signals and potential functional variants in this locus have not been explored. Methods: We conducted a fine-mapping analysis in 55,540 breast cancer cases and 51,168 controls from the Breast Cancer Association Consortium. Results: Conditional analyses identified two independent association signals among women of European ancestry, represented by rs9790517 [conditional P = 2.51 × 10⁻⁴; OR, 1.04; 95% confidence interval (CI), 1.02–1.07] and rs77928427 (P = 1.86 × 10⁻⁴; OR, 1.04; 95% CI, 1.02–1.07). Functional annotation using data from the Encyclopedia of DNA Elements (ENCODE) project revealed two putative functional variants, rs62331150 and rs73838678, in linkage disequilibrium (LD) with rs9790517 (r² ≥ 0.90) residing in the active promoter or enhancer, respectively, of the nearest gene, TET2. Both variants are located in DNase I hypersensitivity and transcription factor–binding sites. Using data from both The Cancer Genome Atlas (TCGA) and Molecular Taxonomy of Breast Cancer International Consortium (METABRIC), we showed that rs62331150 was associated with level of expression of TET2 in breast normal and tumor tissue. Conclusion: Our study identified two independent association signals at 4q24 in relation to breast cancer risk and suggested that observed association in this locus may be mediated through the regulation of TET2. Impact: Fine-mapping study with large sample size warranted for identification of independent loci for breast cancer risk.

    Fine-mapping of prostate cancer susceptibility loci in a large meta-analysis identifies candidate causal variants

    Prostate cancer is a polygenic disease with a large heritable component. A number of common, low-penetrance prostate cancer risk loci have been identified through GWAS. Here we apply the Bayesian multivariate variable selection algorithm JAM to fine-map 84 prostate cancer susceptibility loci, using summary data from a large European ancestry meta-analysis. We observe evidence for multiple independent signals at 12 regions and 99 risk signals overall. Only 15 original GWAS tag SNPs remain among the catalogue of candidate variants identified; the remainder are replaced by more likely candidates. Biological annotation of our credible set of variants indicates significant enrichment within promoter and enhancer elements, and transcription factor-binding sites, including AR, ERG and FOXA1. In 40 regions at least one variant is colocalised with an eQTL in prostate cancer tissue. The refined set of candidate variants substantially increases the proportion of familial relative risk explained by these known susceptibility regions, which highlights the importance of fine-mapping studies and has implications for clinical risk profiling. © 2018 The Author(s).

    Functional mechanisms underlying pleiotropic risk alleles at the 19p13.1 breast-ovarian cancer susceptibility locus

    A locus at 19p13 is associated with breast cancer (BC) and ovarian cancer (OC) risk. Here we analyse 438 SNPs in this region in 46,451 BC and 15,438 OC cases, 15,252 BRCA1 mutation carriers and 73,444 controls and identify 13 candidate causal SNPs associated with serous OC (P = 9.2 × 10⁻²⁰), ER-negative BC (P = 1.1 × 10⁻¹³), BRCA1-associated BC (P = 7.7 × 10⁻¹⁶) and triple negative BC (P-diff = 2 × 10⁻⁵). Genotype-gene expression associations are identified for candidate target genes ANKLE1 (P = 2 × 10⁻³) and ABHD8 (P < 2 × 10⁻³). Chromosome conformation capture identifies interactions between four candidate SNPs and ABHD8, and luciferase assays indicate six risk alleles increased transactivation of the ABHD8 promoter. Targeted deletion of a region containing risk SNP rs56069439 in a putative enhancer induces ANKLE1 downregulation, and mRNA stability assays indicate functional effects for an ANKLE1 3′-UTR SNP. Altogether, these data suggest that multiple SNPs at 19p13 regulate ABHD8 and perhaps ANKLE1 expression, and indicate common mechanisms underlying breast and ovarian cancer risk.